In this project I'm going to detect lane lines on the given images and videos using gradient and color features. I'm going through the following steps to achieve the project's goal:
I'm starting by loading the necessary libraries.
import cv2
import matplotlib.pyplot as plt
import glob
import numpy as np
import datetime
from class_camera import *
from class_image import *
%matplotlib inline
%%javascript
IPython.OutputArea.auto_scroll_threshold = 9999;
For self-driving cars we might want to use information about lane lines, as well as their curvature, in order to choose the correct steering angle and throttle/brake values. Therefore, we need to detect lane lines accurately. But before doing this, we should make sure that the images we're working with accurately represent the world.
Lenses add some distortion to images or a video stream. Geometric camera calibration estimates the parameters of the lens and image sensor of an image or video camera. We can then use these parameters to correct for lens distortion or measure the size of an object in world units.
Camera parameters include intrinsics, extrinsics, and distortion coefficients. To estimate the camera parameters, we need to have 3-D world points and their corresponding 2-D image points. We can get these correspondences using multiple images of a calibration pattern, such as a checkerboard. Using the correspondences, you can solve for the camera parameters. [1]
The images of a checkerboard, taken by the same camera as in the car, were provided by Udacity. I manually set how many inner corners there are on most of the images (6 * 9 = 54), and OpenCV's function cv2.findChessboardCorners() finds the corners by itself.
I've defined class Camera in the file 'class_camera.py' in order to calculate and store all coefficients and transformation matrices.
I start by preparing "object points" (lines 15 - 32), which will be the (x, y, z) coordinates of the chessboard corners in the world. Here I am assuming the chessboard is fixed on the (x, y) plane at z=0, such that the object points are the same for each calibration image. Thus, objp is just a replicated array of coordinates, and objpoints will be appended with a copy of it every time I successfully detect all chessboard corners in a test image. imgpoints will be appended with the (x, y) pixel position of each of the corners in the image plane with each successful chessboard detection. I then used the output objpoints and imgpoints to compute the camera calibration and distortion coefficients using the cv2.calibrateCamera() function. [2]
The next function, undist(img), applies distortion correction to the images using the previously calculated coefficients and the cv2.undistort() function.
I applied this distortion correction to the test image using the cv2.undistort() function and obtained this result:
cam = Camera()
fname = '../CarND-Advanced-Lane-Lines/camera_cal/calibration5.jpg'
img = load_img(fname)
show_2gr(img, cam.undist(img),'Original image','Undistorted image')
fname = '/Users/olegair/Documents/MyNewCareer/SDCND/p4/CarND-Advanced-Lane-Lines/test_images/straight_lines1.jpg'
img = load_img(fname)
show_2gr(img,cam.undist(img),"Original image", "Distortion correction")
I'm going to use gradient and color features of the image to detect the lane lines. I've defined class Image in the file 'class_image.py' to store the image's attributes and describe the image transformation functions:
| Function name | Output | Lines |
|---|---|---|
| abs_sobel_thresh | applies Sobel operator in x or y direction, takes absolute value and applies threshold | 92-116 |
| mag_thresh | applies threshold on the magnitude of the gradient in x and y directions | 118-136 |
| dir_threshold | applies threshold on the direction of the gradient (calculated as the arctangent of the absolute values of the gradients in the x and y directions) | 138-157 |
| binary_and | returns intersection of 2 binary images | 23-31 |
| binary_or | returns union of 2 binary images | 32-41 |
| binary_substr | subtracts the second binary image from the first | 43-49 |
| ch_threshold | applies threshold on single channel of multi-channel image and returns binary image | 9-21 |
Below I'm going through the process of applying thresholds to different variants of the image gradient.
fname = '/Users/olegair/Documents/MyNewCareer/SDCND/p4/CarND-Advanced-Lane-Lines/test_images/test5.jpg'
img = cam.undist(load_img(fname))
img_obj = Image(img)
gray = cv2.cvtColor(img, cv2.COLOR_RGB2GRAY)
sobelx = cv2.Sobel(gray, cv2.CV_64F, 1, 0)
sobely = cv2.Sobel(gray, cv2.CV_64F, 0, 1)
abs_sobelx = np.absolute(sobelx)
abs_sobely = np.absolute(sobely)
scaled_sobelx = np.uint8(255*abs_sobelx/np.max(abs_sobelx))
scaled_sobely = np.uint8(255*abs_sobely/np.max(abs_sobely))
sxbinary = img_obj.abs_sobel_thresh(orient='x', thresh_min=20, thresh_max=100)
sybinary = img_obj.abs_sobel_thresh(orient='y', thresh_min=20, thresh_max=100)
mag_binary = img_obj.mag_thresh(sobel_kernel=31, mag_thresh=(30, 100))
dir_binary = img_obj.dir_threshold(sobel_kernel=15, thresh=(0.7, 1.3))
show_2gr(img, gray,'Original image','Grayscale image')
show_2gr(sobelx,sobely,'Sobel operator X','Sobel operator Y')
show_2gr(abs_sobelx,abs_sobely,'Sobel operator X, absolute','Sobel operator Y, absolute')
show_2gr(scaled_sobelx,scaled_sobely,'Sobel operator X, scaled','Sobel operator Y, scaled')
show_2gr(sxbinary, sybinary,'Sobel operator X, scaled, thresholded','Sobel operator Y, scaled, thresholded')
show_2gr(mag_binary,dir_binary,'Magnitude of the Gradient, thresholded','Direction of the Gradient, thresholded')
Using the functions above, I was looking for the combination of thresholds and gradient operations that gives the clearest image of the lane lines. Here is an example of applying the Sobel operator in the x direction:
img_obj.show_thresh(slide=5, step=25, top=80, cols=2, method=0)
Sobel operator in x direction, thresholded (25,55):
gradx = img_obj.abs_sobel_thresh("x", 25, 55)
show_2gr(img,gradx, 'Original image', 'SobelX(25,55)')
Thresholded gradient direction (1.0721, pi/2):
gradd = img_obj.dir_threshold(sobel_kernel=9, thresh=(1.0721,np.pi/2))
gradx_d = binary_substr(gradx, gradd)
show_2gr(gradd, gradx_d, 'Thresholded gradient direction(1.0721, pi/2)', 'SobelX - Thresholded direction(1.0721,pi/2)')
In addition to the gradient features of the image, I'm also considering benefiting from color features and experimenting with the channels of the HLS image representation.
hls = cv2.cvtColor(img,cv2.COLOR_RGB2HLS)
H_hls = hls[:,:,0]
L_hls = hls[:,:,1]
S_hls = hls[:,:,2]
f, (ax1) = plt.subplots(1,1)
f.tight_layout()
ax1.imshow(img)
ax1.set_title('Original Image')
f, (ax1, ax2, ax3) = plt.subplots(1, 3, figsize=(24, 9))
f.tight_layout()
ax1.imshow(H_hls, cmap='gray')
ax1.set_title('H', fontsize=50)
ax2.imshow(L_hls, cmap='gray')
ax2.set_title('L', fontsize=50)
ax3.imshow(S_hls, cmap='gray')
ax3.set_title('S', fontsize=50)
With a thresholding approach similar to the one used with gradients, I've been trying to find thresholds that separate the lane lines from other objects in the image.
s_th = ch_threshold(hls, 2, thresh=(150,255))
show_2gr(img, s_th, 'Original image','Saturation channel of the image, thresholded')
h_th = ch_threshold(hls, 0, thresh=(25,180))
show_2gr(img,h_th,'Original image','Hue channel of the image, thresholded')
s_h = binary_substr(s_th,h_th)
show_2gr(s_th, s_h,'Saturation channel, thresholded(150:255)', 'Saturation(150:255) - Hue(25:180)')
Another representation of an RGB image is HSV. I found it useful to apply color filters to the images, e.g. a yellow filter to separate the yellow line.
hsv = cv2.cvtColor(img,cv2.COLOR_RGB2HSV)
H_hsv = hsv[:,:,0]
S_hsv = hsv[:,:,1]
V_hsv = hsv[:,:,2]
f, (ax1) = plt.subplots(1,1)
f.tight_layout()
ax1.imshow(img)
ax1.set_title('Original Image')
f, (ax1, ax2, ax3) = plt.subplots(1, 3, figsize=(24, 9))
f.tight_layout()
ax1.imshow(H_hsv, cmap='gray')
ax1.set_title('H', fontsize=50)
ax2.imshow(S_hsv, cmap='gray')
ax2.set_title('S', fontsize=50)
ax3.imshow(V_hsv, cmap='gray')
ax3.set_title('V', fontsize=50)
In the function manage_yellow (class_image.py, lines 188-202) I've defined the range of yellow color in HSV. It helps me find the solid yellow lane lines.
yellow = img_obj.manage_yellow()
show_2gr(img, yellow, 'Original image', 'Yellow filter applied')
Finally, I've created 2 binary images using color and gradient features. I've created the color image (function manage_color, class_image.py, lines 222-235) by taking the thresholded saturation channel (150, 255), adding (binary OR) the yellow mask, subtracting the thresholded hue (25, 180) and adding the thresholded lightness (200, 255).
I've created the gradient image (function manage_grad, class_image.py, lines 204-220) by applying the gradient in the x direction, thresholded (25, 55), then subtracting the thresholded gradient direction (1.0721, pi/2), the thresholded gradient direction (0.0982, 0.3927), the thresholded hue (85, 255) and the thresholded hue (0, 6).
The final binary image (function binary_th, class_image.py, lines 237-240) is a combination (logical OR) of the color and gradient images.
show_2gr(img_obj.manage_color(), img_obj.manage_grad(), "Color thresholded", "Gradient thresholded")
binary = img_obj.binary_th()
show_2gr(img, binary, "Original image", "Binary thresholded image")
The combination of color- and gradient-thresholded binary images improves the robustness of lane line detection.
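The binary set operations used in these combinations (the binary_or and binary_substr helpers from the table above) amount to elementwise logic on 0/1 arrays; a minimal sketch:

```python
import numpy as np

# Binary images are 0/1 uint8 arrays, as in class_image.py.
def binary_or(a, b):
    """Union: pixel is set if it is set in either input."""
    return ((a == 1) | (b == 1)).astype(np.uint8)

def binary_substr(a, b):
    """Subtraction: keep pixels set in a but not in b."""
    return ((a == 1) & (b == 0)).astype(np.uint8)

# The final mask is then the union of the two feature images, e.g.:
# binary = binary_or(color_binary, grad_binary)
```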
Below I've defined 2 functions for the perspective transformation of the binary image to a "bird's-eye view" (warp: class_camera.py, lines 62-70) and back (unwarp: class_camera.py, lines 72-78). To warp an image I need to map 4 points of the original image onto 4 points of the new image. To find the transformation matrix, I've taken the coordinates of points on an image of a flat road, keeping in mind that the lane lines are parallel there.
I should mention that I don't preserve proportions here: the warped image covers roughly 22.5 m in height and 7.4 m in width.
show_2gr(img,cam.warp(img))
binary_warp = cam.warp(binary)
show_2gr(binary,binary_warp)
In order to detect lane lines and track their parameters, 2 more classes were defined.
Class Line (lines 454-474) keeps the attributes of a line: the fit (polynomial coefficients), whether the line was detected, the radius, etc.
Class Lane (lines 253-452) was implemented to detect and then track the lane lines. This class also contains the pipeline method, which is used for video processing.
Initially, to detect the lines I'm using a blind search method: I slide windows from the bottom of the image to the top, using the histogram's local maxima as starting points for the windows. On the following steps, I'm using a quick search, looking for line pixels in the area within +/- margin of the polynomial. If a line is not found, I fall back to the blind search on the next step.
In order to find the lane-line pixels, I'm calculating the histogram of the bottom half of the image and its 2 local maxima: to the left and to the right of the image's x center (assuming that the camera is mounted at the center of the car and the car follows the lane).
I've defined the hist function (class_image.py, lines 70-77), which calculates the histogram (it actually sums up all y pixels for each x position).
Below is an example of a warped image with lane lines and its histogram, whose local maxima help me identify the 2 starting points of the lines.
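A minimal sketch of such a histogram function (my reconstruction of hist, not the exact code from class_image.py):

```python
import numpy as np

def hist(binary_warp):
    """Sum the pixel values of the bottom half of the binary image
    column by column, giving one value per x position."""
    bottom = binary_warp[binary_warp.shape[0] // 2:, :]
    return np.sum(bottom, axis=0)
```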
histogram = hist(binary_warp)
plt.imshow(binary_warp,cmap='gray')
plt.plot(img.shape[0]-histogram,linewidth=2)
In order to find the lane-line pixels, I'm using the sliding windows method. The idea is to slide through the image from bottom to top (i.e. from the largest y values to 0) with rectangles of the same size and assign all the points that fall into the rectangles to the lane-line pixels.
On the x axis, the centers of the initial rectangles are the histogram maxima to the left and to the right of the center; the width of each rectangle is a fixed value equal to 2 margins.
At each step, when the number of points that fall into a rectangle exceeds a threshold, the center of the rectangle shifts to their average x position.
To find the coefficients of the polynomials, I'm using numpy's np.polyfit function, fitting it to the detected lane-line pixels.
The function first_lines returns the coefficients of the 2 polynomials, which are used later to draw the lines or to fill the road between them with color.
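The blind search plus polynomial fit described above can be sketched as follows. This is my reconstruction of the idea behind first_lines under assumed defaults (9 windows, margin 100, minpix 50), not the exact code from class_image.py.

```python
import numpy as np

def sliding_window_fit(binary_warp, nwindows=9, margin=100, minpix=50):
    """Blind search: slide windows from bottom to top, collect the lane
    pixels inside them, then fit x = f(y) second-order polynomials."""
    h, w = binary_warp.shape
    histogram = np.sum(binary_warp[h // 2:, :], axis=0)
    midpoint = w // 2
    leftx_cur = np.argmax(histogram[:midpoint])              # left start
    rightx_cur = midpoint + np.argmax(histogram[midpoint:])  # right start

    nonzeroy, nonzerox = binary_warp.nonzero()
    win_h = h // nwindows
    left_idx, right_idx = [], []

    for win in range(nwindows):          # bottom to top
        y_hi = h - win * win_h
        y_lo = y_hi - win_h
        for x_cur, idx in ((leftx_cur, left_idx), (rightx_cur, right_idx)):
            good = ((nonzeroy >= y_lo) & (nonzeroy < y_hi) &
                    (nonzerox >= x_cur - margin) &
                    (nonzerox < x_cur + margin)).nonzero()[0]
            idx.append(good)
        # recenter the next window on the mean x when enough pixels hit
        if len(left_idx[-1]) > minpix:
            leftx_cur = int(nonzerox[left_idx[-1]].mean())
        if len(right_idx[-1]) > minpix:
            rightx_cur = int(nonzerox[right_idx[-1]].mean())

    left_idx = np.concatenate(left_idx)
    right_idx = np.concatenate(right_idx)
    left_fit = np.polyfit(nonzeroy[left_idx], nonzerox[left_idx], 2)
    right_fit = np.polyfit(nonzeroy[right_idx], nonzerox[right_idx], 2)
    return left_fit, right_fit
```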
I've defined fill_lane to draw a color-filled polygon between the lane lines. To do this, I'm using the cv2.fillPoly function, fed with the points of the 2 polynomials.
As soon as I get the line pixels, I'm calculating the radius of curvature, as described below:
The radius of curvature at any point x of the function x=f(y) is given as follows:
$$R_{curve}=\frac{\left(1+\left(\frac{dx}{dy}\right)^{2}\right)^{3/2}}{\left|\frac{d^2x}{dy^2}\right|}$$
$$f'(y)=\frac{dx}{dy}=2Ay+B$$
$$f''(y)=\frac{d^2x}{dy^2}=2A$$
So our equation for the radius of curvature becomes:
$$R_{curve}=\frac{\left(1+(2Ay+B)^2\right)^{3/2}}{|2A|}$$
The function curvature (class_image.py, lines 416-428) implements the calculation of the radius.
Below I'm detecting lane lines on the test images provided by Udacity.
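The formula above translates directly into code; a minimal sketch in pixel units (the project additionally scales pixels to meters before fitting):

```python
import numpy as np

def curvature(fit, y_eval):
    """Radius of curvature of x = A*y^2 + B*y + C at y = y_eval:
    R = (1 + (2*A*y + B)^2)^(3/2) / |2*A|."""
    A, B, _ = fit
    return (1 + (2 * A * y_eval + B) ** 2) ** 1.5 / abs(2 * A)
```

For the parabola x = y^2 / (2R), which approximates a circle of radius R near y = 0, the function returns exactly R at y = 0.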
url = '../CarND-Advanced-Lane-Lines/test_images/'
name = 'test*.jpg'
images = glob.glob(url+name)
for i, fname in enumerate(images):
    lane = Lane()
    img = cam.undist(load_img(fname))
    res = lane.pipeline(img)
    show_2gr(img, res, "{}".format(fname[-10:]))
On the next step I've defined the function process_video (lines 476-486) and applied my lane line detection algorithm to the video provided by Udacity. The output video is uploaded to the GitHub repository and available here: https://github.com/olegleyz/SDCND-p4-advanced-lane-finding/blob/master/project_video_out.mp4
In the current version of the project I'm applying the basic steps needed to detect lane lines on the provided images and videos.
I've spent a significant amount of time searching for optimal thresholds. While this strategy might show good results under some conditions, it doesn't necessarily work in others. An adaptive thresholding algorithm would be really valuable for this task.
Currently I'm tracking the lines and applying the lighter search after the initial line detection. To make the model more robust, sanity checks and independent line tracking (if just one line was detected, generate the second one, knowing the distance between the lines and the curve from previous frames) might be helpful.
The performance of the model still requires improvement: it takes a significant amount of time to process the video, which wouldn't work in a real-time implementation.